Decision Theoretic Learning of Human Facial Displays
Abstract
Changes in the human face occur due to many factors, including communication, emotion, speech, and physiology. Most systems for facial expression analysis attempt to recognize one or more of these factors, resulting in a machine whose inputs are video sequences or static images, and whose outputs are, for example, basic emotion categories. Our approach is fundamentally different. We make no prior commitment to any particular recognition task. Instead, we consider that the meaning of a facial display to an observer is contained in its relationship to actions and outcomes. Agents must distinguish facial displays according to their affordances, or how they help an agent to maximize utility. We show how an agent can learn relationships between observations of a person’s face, the context in which that person is acting, and its own actions and utility function. The agent can then base its decisions on these learned decision-theoretic models, allowing it to make value-directed action choices based, in part, upon observed facial displays.

Recent research on the communicative function of the face supports our approach. Psychologists have concluded that facial displays are purposeful communicative signals [1], that their purpose depends on both the display and the context of its emission [4], and that the signals vary widely between individuals [4]. These considerations imply that a rational communicative agent must learn the relationships between facial displays, the context in which they are shown, and its own utility function: it must be able to compute the utility of taking actions in situations involving purposeful facial displays. The agent will then be able to make value-directed decisions based, in part, upon the “meaning” of facial displays as contained in these learned connections between displays, context, and utility. Learning these relationships also allows an agent to adapt to new interactants and new situations.

The model we propose is a partially observable Markov decision process, or POMDP, which realises the design constraints suggested by the psychology literature, combining the recognition of facial signals with their interpretation and use in a consistent utility-maximization framework. The parameters of the model are learned from training data using an a posteriori constrained optimization technique, such that an agent can learn to act based on the facial signals of a human through observation. The training is unsupervised: we do not train classifiers for individual facial displays and then combine them in the model. Rather, the learning process discovers clusters of facial motions and their relationship to the context automatically. The advantage of this approach is threefold. First, we do not need expert knowledge about which facial motions are important. Second, since the system learns categories of motions, it adapts to novel gestures or displays without modification. Third, resources can be focused on tasks that will be useful for the agent: it is wasteful to train complex classifiers for the recognition of fine facial motion if only simple displays are used in the agent’s context.
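To make the decision-theoretic loop concrete, here is a minimal sketch of the kind of computation the abstract describes: an agent maintains a belief over hidden states (clusters of facial displays in context), updates that belief from discretized observations, and chooses actions by expected utility. Everything in the sketch, including the state/action/observation dimensions, the randomly generated model parameters, and the myopic one-step lookahead, is an illustrative assumption rather than the paper's actual model, which learns these quantities from video and plans beyond a single step.

```python
"""Minimal POMDP sketch of value-directed action choice.
All dimensions, parameters, and observation codes are hypothetical;
the paper learns the corresponding quantities from training data."""
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, n_obs = 3, 2, 4

# T[a][s, s']: transition model, Z[a][s', o]: observation model, R[s, a]: utility.
T = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
Z = rng.dirichlet(np.ones(n_obs), size=(n_actions, n_states))
R = rng.normal(size=(n_states, n_actions))

def belief_update(b, a, o):
    """Bayes filter: b'(s') is proportional to Z[a][s', o] * sum_s T[a][s, s'] b(s)."""
    b_next = Z[a][:, o] * (T[a].T @ b)
    return b_next / b_next.sum()

def value_directed_action(b):
    """One-step lookahead: maximize immediate expected utility under belief b
    (a myopic stand-in for full POMDP planning)."""
    return int(np.argmax(b @ R))

# Interaction loop over a stream of discretized facial-display observations.
b = np.full(n_states, 1.0 / n_states)  # uniform prior belief
for o in [0, 2, 1, 3]:                 # hypothetical observation codes
    a = value_directed_action(b)
    b = belief_update(b, a, o)
    print(f"obs={o} action={a} belief={np.round(b, 3)}")
```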
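The unsupervised discovery of display categories can likewise be illustrated with a generic mixture-model fit over per-frame facial-motion features. The feature representation, cluster count, and use of scikit-learn's GaussianMixture are assumptions made for this sketch; in the paper, display clusters are discovered jointly with the rest of the POMDP through the constrained a posteriori optimization, not as a separate preprocessing step.

```python
"""Illustrative stand-in for unsupervised clustering of facial motions.
Features, cluster count, and the GMM itself are assumptions; the paper
discovers clusters inside the POMDP learning, not as preprocessing."""
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Hypothetical per-frame motion features (e.g., summarized optical flow).
features = rng.normal(size=(500, 6))

gmm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0)
labels = gmm.fit_predict(features)  # discovered, unlabeled display clusters
print(np.bincount(labels))          # cluster occupancy counts
```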
Similar Papers
Decision Theoretic Modeling of Human Facial Displays
We present a vision based, adaptive, decision theoretic model of human facial displays in interactions. The model is a partially observable Markov decision process, or POMDP. A POMDP is a stochastic planner used by an agent to relate its actions and utility function to its observations and to other context. Video observations are integrated into the POMDP using a dynamic Bayesian network that c...
NGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self-Organizing Map
Identifying clusters is an important aspect of data analysis. This paper proposes a novel data clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizing map (NGTSOM) and neural gas (NG) are used in combination with Competitive Hebbian Learning (CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clustering data. Different ...
Who do you trust? The impact of facial emotion and behaviour on decision making.
During social interactions, we use available information to guide our decisions, including behaviour and emotional displays. In some situations, behaviour and emotional displays may be incongruent, complicating decision making. This study had two main aims: first, to investigate the independent contributions of behaviour and facial displays of emotion on decisions to trust, and, second, to exam...
Comparing Rule-Based and Data-Driven Selection of Facial Displays
The non-verbal behaviour of an embodied conversational agent is normally based on recorded human behaviour. There are two main ways that the mapping from human behaviour to agent behaviour has been implemented. In some systems, human behaviour is analysed, and then rules for the agent are created based on the results of that analysis; in others, the recorded behaviour is used directly as a reso...
The Influence of Emotions in Embodied Agents on Human Decision-Making
Acknowledging the social functions that emotions serve, there has been growing interest in the interpersonal effect of emotion in human decision making. Following the paradigm of experimental games from social psychology and experimental economics, we explore the interpersonal effect of emotions expressed by embodied agents on human decision making. The paper describes an experiment where parti...